docs: advanced guides for conflict resolution and error handling (#191) #302

halotukozak wants to merge 1 commit into master
Conversation
Codecov Report ✅ All modified and coverable lines are covered by tests.

```
@@ Coverage Diff @@
##        master    #302   +/- ##
=========================================
  Coverage      ?  42.03%
=========================================
  Files         ?      35
  Lines         ?     433
  Branches      ?       0
=========================================
  Hits          ?     182
  Misses        ?     251
  Partials      ?       0
=========================================
```
Pull request overview
This PR adds three new “advanced guides” to the Alpaca documentation, covering parser conflict resolution, contextual parsing via lexer/parser context, and strategies for lexer error handling.
Changes:
- Add a conflict resolution guide explaining shift/reduce and reduce/reduce conflicts and Alpaca's `before`/`after` DSL.
- Add a contextual parsing guide describing `LexerCtx`, `ParserCtx`, and context-driven lexing patterns.
- Add a lexer error handling guide describing catch-all token strategies and continuing after invalid input.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| docs/_docs/guides/lexer-error-handling.md | New guide for resilient lexing patterns (catch-all token, counting errors, ignoring invalid chars). |
| docs/_docs/guides/contextual-parsing.md | New guide for context-driven lexing/parsing and how state flows through lexer → lexemes → parser. |
| docs/_docs/guides/conflict-resolution.md | New guide describing conflict types and how to resolve them using Alpaca’s conflict resolution DSL. |
```scala
case class ErrorCtx(
  var text: CharSequence = "",
  var errorCount: Int = 0
) extends LexerCtx

val myLexer = lexer[ErrorCtx]:
  case "[a-z]+" => Token["ID"]
  case "\\s+" => Token.Ignored

  case x @ "." =>
    ctx.errorCount += 1
    println(s"Error: Unexpected character '$x' at position ${ctx.position}")
    Token.Ignored // Skip the character
```
The ErrorCtx example logs ${ctx.position}, but position is not a member of LexerCtx unless the context mixes in PositionTracking (or uses LexerCtx.Default). Update the example context definition accordingly so it compiles and matches the described behavior.
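A fix in the spirit of this comment might look like the following sketch (assuming `PositionTracking` is the mixin that contributes the `position` field, as the comment implies):

```scala
// Hypothetical corrected context: mix in PositionTracking so that
// ctx.position is actually available inside the lexer rules.
case class ErrorCtx(
  var text: CharSequence = "",
  var errorCount: Int = 0
) extends LexerCtx, PositionTracking
```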
```scala
case "\(" =>
  ctx.stack.push("paren")
  Token["("]
case "\)" =>
```
These lexer patterns use "\(" and "\)" (single backslash). In Scala string literals \( / \) are invalid escape sequences; if the intent is to match literal parentheses in a regex, the strings should be escaped as "\\(" and "\\)" (or written using triple-quoted strings).
Suggested change:

```scala
case "\\(" =>
  ctx.stack.push("paren")
  Token["("]
case "\\)" =>
```
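As a library-free aside, the escaping rule behind this suggestion can be checked in plain Scala: the double backslash in the string literal yields a single backslash, which the regex engine then reads as an escape for the literal parenthesis.

```scala
import scala.util.matching.Regex

// "\\(" is the two characters \ and ( — a regex matching a literal '('.
val paren: Regex = "\\(".r
assert(paren.matches("("))
assert(!paren.matches("x"))
```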
```scala
case x @ "
+" =>
```
The indentation lexer example has a broken multi-line string literal for the newline+spaces pattern (case x @ " on one line and +" on the next). As written, this is not valid Scala and will be confusing to readers; represent the pattern as a valid single-line string (e.g., using \n and escaped backslashes) or a properly delimited triple-quoted string.
Suggested change:

```scala
case x @ "\\n +" =>
```
```scala
// id is a Lexeme, which has a .fields property
// fields contains all members of your LexerCtx
println(s"Matched ID at line ${id.fields.line}")
```
This section states that a Lexeme has a .fields property and shows id.fields.line, but fields is not publicly accessible on alpaca.internal.lexer.Lexeme (it’s private[alpaca]). Readers should access captured context fields via the lexeme’s dynamic members (e.g., id.line, id.position, id.text) or whatever the intended public API is.
Suggested change:

```scala
// id is a Lexeme; captured context fields are exposed as dynamic members
// e.g. if your LexerCtx has a `line` field, you can access it as `id.line`
println(s"Matched ID at line ${id.line}")
```
```scala
case """ =>
  ctx.inString = !ctx.inString
  Token["QUOTE"]

case "[a-z]+" if !ctx.inString => Token["KEYWORD"]
case "[^"]+" if ctx.inString => Token["STRING_CONTENT"]
```
The mode-switching example has invalid Scala string literals: case """ => is an unterminated triple-quoted string, and the regex pattern "[^"]+" contains an unescaped quote. Please rewrite these patterns using valid Scala literals (often easiest with properly delimited triple-quoted strings) so the example can be copied verbatim.
Suggested change:

```scala
case "\"" =>
  ctx.inString = !ctx.inString
  Token["QUOTE"]
case "[a-z]+" if !ctx.inString => Token["KEYWORD"]
case """[^"]+""" if ctx.inString => Token["STRING_CONTENT"]
```
## 5. The `BetweenStages` Hook

The `BetweenStages` hook is the internal engine that powers context updates. It is a function called by Alpaca after **every** token match (including `Token.Ignored`) but **before** the next match starts.

### Automatic Updates

By default, Alpaca uses `BetweenStages` to automatically update the `text` field in your context. If your context extends `LineTracking` or `PositionTracking`, it also increments `line` and `position` counters.

### Customizing `BetweenStages`

If you need complex logic to run after every match, you can provide a custom `given` instance of `BetweenStages`.

```scala
given MyBetweenStages: BetweenStages[MyCtx] with
  def apply(token: Token[?, MyCtx, ?], matcher: Matcher, ctx: MyCtx): Unit =
    // Custom global logic
    println(s"Just matched ${token.info.name}")
```
The guide suggests customizing BetweenStages via a user-provided given, but BetweenStages is currently declared private[alpaca] (see src/alpaca/internal/lexer/BetweenStages.scala), so downstream users can’t reference or implement it. Either expose BetweenStages as part of the public API (or provide a public hook) or adjust the documentation to reflect the supported customization mechanisms (e.g., mixing in LineTracking/PositionTracking).
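If the docs are revised along these lines, the supported route would be the tracking mixins rather than a custom `given`. A sketch, reusing only names that appear in the guide (`LexerCtx`, `LineTracking`, `PositionTracking`, `lexer`, `Token`):

```scala
// Instead of customizing the private BetweenStages hook, mix the
// tracking traits into the context; per the guide, Alpaca then
// maintains the line and position counters after every match.
case class MyCtx(var text: CharSequence = "")
  extends LexerCtx, LineTracking, PositionTracking

val myLexer = lexer[MyCtx]:
  case "[a-z]+" => Token["ID"]   // ctx.line / ctx.position stay up to date
  case "\\s+"   => Token.Ignored
```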
```scala
val resilientLexer = lexer:
  case "[0-9]+" => Token["NUM"]
  case "\s+" => Token.Ignored
```
In the resilient lexer example, the whitespace regex is written as "\s+" (single backslash). In a Scala string literal this is an invalid escape sequence; use "\\s+" (or a triple-quoted string) to represent the \s+ regex correctly.
Suggested change:

```scala
case "\\s+" => Token.Ignored
```
📊 Test Compilation Benchmark

Result: Current branch is 4.227s slower (8.68%)
🤖 Generated with Claude Code